The Stratified Cox Procedure
Introduction
The stratified Cox model is a modification of the Cox proportional hazards (PH) model that controls, by stratification, for a predictor that does not satisfy the PH assumption. Predictors that are assumed to satisfy the PH assumption are included in the model, whereas the predictor being stratified is not included.
Example
Consider the computer results for a Cox PH model that includes three variables (logWBC, treatment group, and SEX) from a clinical trial of 42 leukemia patients. The goal of the trial was to determine the number of days patients remained in remission.
The P(PH) values for logWBC and treatment group were non-significant, but the P(PH) value for SEX was significant at the 0.05 level. This suggests that logWBC and treatment group conform to the PH assumption, while SEX does not. These findings were supported by the graphical procedures discussed earlier.
As one of the predictors did not meet the PH assumption, we performed a stratified Cox (SC) procedure for analysis. This allowed us to control for the variable that did not meet the PH assumption (SEX) by stratification, while including the logWBC and treatment variables that did meet the PH assumption in the model.
The General Stratified Cox (SC) Model
We assume that we have $k$ variables not satisfying the PH assumption and $p$ variables satisfying the PH assumption. We denote the variables not satisfying the PH assumption as $Z_1, Z_2, \ldots, Z_k$ and the variables satisfying the PH assumption as $X_1, X_2, \ldots, X_p$.
To perform the stratified Cox procedure, we define a single new variable, which we call $Z^*$, from the $Z$'s to be used for stratification. We do this by forming categories of each $Z_i$, including those that are interval variables. We then form combinations of categories, and these combinations are our strata. These strata are the categories of the new variable $Z^*$.
For example, suppose $k$ is 2, and the two $Z$'s are age (an interval variable) and treatment status (a binary variable). Then we categorize age into, say, three age groups: young, middle, and old. We then form the six age group-by-treatment status combinations (three age groups times two treatment statuses). These six combinations represent the different categories of a single new variable that we stratify on in our stratified Cox model. We call this new variable $Z^*$.
In general, the stratification variable $Z^*$ will have $k^*$ categories, where $k^*$ is the total number of combinations (or strata) formed after categorizing each of the $Z$'s. In the above example, $k^*$ is equal to 6.
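The construction of the single stratification variable from age and treatment status can be sketched in a few lines of Python. This is only an illustration: the age cut points (40 and 60), the labels "treated" and "untreated", and the names `age_group`, `stratum`, `ALL_STRATA` and `K_STAR` are assumptions made for the example and do not come from the text.

```python
from itertools import product

# Hypothetical cut points for the three age groups; the text does not specify any.
AGE_CUTS = (40, 60)  # young: < 40, middle: 40 to 59, old: >= 60

AGE_GROUPS = ("young", "middle", "old")
TREATMENT_LEVELS = ("treated", "untreated")  # assumed labels for the binary variable


def age_group(age):
    """Categorize the interval variable age into one of three groups."""
    if age < AGE_CUTS[0]:
        return "young"
    if age < AGE_CUTS[1]:
        return "middle"
    return "old"


def stratum(age, treated):
    """Return the category of the single stratification variable for one subject."""
    return (age_group(age), "treated" if treated else "untreated")


# Every age group-by-treatment status combination is one stratum.
ALL_STRATA = list(product(AGE_GROUPS, TREATMENT_LEVELS))
K_STAR = len(ALL_STRATA)  # total number of strata
```

With three age groups and two treatment statuses, `K_STAR` is 6, matching the count in the example above. Adding a further categorized variable would multiply the number of strata by its number of categories.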
We now present the general hazard function form for the stratified Cox model, as shown here. This formula contains a subscript $g$ which indicates the $g$-th stratum. The strata are defined as the different categories of the stratification variable $Z^*$, and the number of strata equals $k^*$.
The general SC model:
$$h_g(t, \mathbf{X}) = h_{0g}(t)\,\exp\left[\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p\right], \quad g = 1, 2, \ldots, k^*$$
Note that the variable $Z^*$ is not explicitly included in the model but that the $X$'s, which are assumed to satisfy the PH assumption, are included in the model.
Note also that the baseline hazard function $h_{0g}(t)$ is allowed to be different for each stratum. However, the coefficients $\beta_1, \beta_2, \ldots, \beta_p$ are the same for each stratum.
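The two notes above can be checked numerically with a small sketch of the stratified hazard: a stratum-specific baseline hazard multiplied by the exponential of a linear predictor whose coefficients are shared across strata. Everything numeric here is an assumption for illustration: the two Weibull-type baseline hazards in `BASELINES`, the coefficient values in `BETA`, and the function names are hypothetical and are not estimates from any data set.

```python
import math

# Hypothetical stratum-specific baseline hazards (Weibull-type shapes).
# Stratum 1 has an increasing baseline hazard, stratum 2 a decreasing one.
BASELINES = {
    1: lambda t: 0.10 * 1.5 * t ** 0.5,
    2: lambda t: 0.30 * 0.8 * t ** -0.2,
}

# Illustrative coefficients; the same values are used in every stratum.
BETA = [0.7, -0.4]


def hazard(g, t, x):
    """Hazard at time t for covariates x in stratum g: baseline_g(t) * exp(beta . x)."""
    linear = sum(b * v for b, v in zip(BETA, x))
    return BASELINES[g](t) * math.exp(linear)


def hazard_ratio(g, t, x_a, x_b):
    """Ratio of hazards for two covariate patterns within the same stratum g."""
    return hazard(g, t, x_a) / hazard(g, t, x_b)
```

Because the baseline cancels in the ratio, `hazard_ratio(1, 2.0, [1, 0], [0, 0])` and `hazard_ratio(2, 5.0, [1, 0], [0, 0])` both equal `math.exp(0.7)` up to rounding, even though the baseline hazards themselves differ between strata. This is the sense in which the coefficients are common to all strata while the baselines are not.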
A Graphical View of the Stratified Cox Approach
In this section we examine four log-log survival plots illustrating the assumptions underlying a stratified Cox model with or without interaction. Each of the four models considers two dichotomous predictors: treatment (RX, which distinguishes placebo from new treatment) and SEX (coded 0 for females and 1 for males). The four models are as follows.
Model 1. This model assumes the PH assumption for both RX and SEX and also assumes no interaction between RX and SEX. Notice all four log-log curves are parallel (PH assumption) and the effect of treatment is the same for females and males (no interaction). The effect of treatment (controlling for SEX) can be interpreted as the distance between the log-log curves for the two treatment groups, for males and for females, separately.
$$h(t) = h_0(t)\,\exp\left[\beta_1\,\mathrm{RX} + \beta_2\,\mathrm{SEX}\right]$$
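Model 2. This model assumes the PH assumption for both RX and SEX and allows for interaction between these two variables. All four log-log curves are parallel (PH assumption), but the effect of treatment is larger for males than for females, as the distance between the curves for the two treatment groups is greater for males.
$$h(t) = h_0(t)\,\exp\left[\beta_1\,\mathrm{RX} + \beta_2\,\mathrm{SEX} + \beta_3\,\mathrm{RX} \times \mathrm{SEX}\right]$$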
Model 3. This is a stratified Cox model in which the PH assumption is not assumed for SEX. Notice the curves for males and females are not parallel. However, the curves for RX are parallel within each stratum of SEX, indicating that the PH assumption is satisfied for RX. The distance between the log-log curves for the two treatment groups is the same for males and females, indicating no interaction between RX and SEX.
$$h_g(t) = h_{0g}(t)\,\exp\left[\beta_1\,\mathrm{RX}\right]$$
(g = 1 for males, g = 0 for females)
Model 4. This is a stratified Cox model allowing for interaction of RX and SEX. The curves for males and females are not parallel, although the PH assumption is satisfied for RX within each stratum of SEX. The distance between the log-log curves for the two treatment groups is greater for males than for females, indicating interaction between RX and SEX.
$$h_g(t) = h_{0g}(t)\,\exp\left[\beta_1\,\mathrm{RX} + \beta_2\,\mathrm{RX} \times \mathrm{SEX}\right]$$
(g = 1 for males, g = 0 for females)
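The geometry of the last two models (stratified on SEX, without and with an RX-by-SEX interaction) can be reproduced with a minimal numerical sketch. It assumes Weibull cumulative baseline hazards with a different shape in each SEX stratum, takes the log-log survival to be ln(-ln S(t)), and uses a 0/1 indicator for RX without committing to which treatment group is coded 1. The parameter values in `BASE`, the coefficient arguments, and the function names are all hypothetical.

```python
import math

# Hypothetical Weibull cumulative baseline hazards lam * t**p, one per SEX stratum.
# Different shapes p make the male and female curves non-parallel.
BASE = {0: (0.05, 1.0), 1: (0.02, 1.6)}  # g: (lam, p); g = 0 females, g = 1 males


def log_log_survival(t, rx, g, beta1, beta2=0.0):
    """ln(-ln S(t)) under a Cox model stratified on SEX.

    beta2 = 0 gives the no-interaction model; beta2 != 0 adds an RX x SEX term.
    """
    lam, p = BASE[g]
    cum_hazard = lam * t ** p * math.exp(beta1 * rx + beta2 * rx * g)
    survival = math.exp(-cum_hazard)
    return math.log(-math.log(survival))


def rx_distance(t, g, beta1, beta2=0.0):
    """Vertical distance between the RX = 1 and RX = 0 curves within stratum g."""
    return (log_log_survival(t, 1, g, beta1, beta2)
            - log_log_survival(t, 0, g, beta1, beta2))


def sex_gap(t, rx, beta1, beta2=0.0):
    """Vertical distance between the male and female curves at a fixed RX value."""
    return (log_log_survival(t, rx, 1, beta1, beta2)
            - log_log_survival(t, rx, 0, beta1, beta2))
```

Without interaction, `rx_distance(t, g, 0.8)` returns 0.8 (up to rounding) at every `t` and in both strata, which is the parallel, equal-distance pattern described for the no-interaction stratified model. With `beta2 = 0.5` the distance stays 0.8 for females but becomes 1.3 for males. In both cases `sex_gap` changes with `t`, reflecting that the male and female curves are not parallel.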